Results 1 - 20 of 80
1.
Angew Chem Int Ed Engl ; 63(21): e202401189, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38506220

ABSTRACT

This study introduces a novel approach for synthesizing Benzoxazine-centered Polychiral Polyheterocycles (BPCPHCs) via an innovative asymmetric carbene-alkyne metathesis-triggered cascade. Overcoming challenges associated with intricate stereochemistry and multiple chiral centers, the catalytic asymmetric Carbene Alkyne Metathesis-mediated Cascade (CAMC) is employed using dirhodium catalyst/Brønsted acid co-catalysis, ensuring precise stereocontrol as validated by X-ray crystallography. Systematic substrate scope evaluation establishes exceptional diastereo- and enantioselectivities, creating a unique library of BPCPHCs. Pharmacological exploration identifies twelve BPCPHCs as potent Nav ion channel blockers, notably compound 8g. In vivo studies demonstrate that intrathecal injection of 8g effectively reverses mechanical hyperalgesia associated with chemotherapy-induced peripheral neuropathy (CIPN), suggesting a promising therapeutic avenue. Electrophysiological investigations unveil the inhibitory effects of 8g on Nav1.7 currents. Molecular docking, dynamics simulations and surface plasmon resonance (SPR) assay provide insights into the stable complex formation and favorable binding free energy of 8g with C5aR1. This research represents a significant advancement in asymmetric CAMC for BPCPHCs and unveils BPCPHC 8g as a promising, uniquely acting pain blocker, establishing a C5aR1-Nav1.7 connection in the context of CIPN.


Subject(s)
Alkynes , Benzoxazines , Methane , Methane/analogs & derivatives , Methane/chemistry , Methane/pharmacology , Alkynes/chemistry , Benzoxazines/chemistry , Benzoxazines/pharmacology , Benzoxazines/chemical synthesis , Heterocyclic Compounds/chemistry , Heterocyclic Compounds/pharmacology , Heterocyclic Compounds/chemical synthesis , Humans , Stereoisomerism , Analgesics/chemistry , Analgesics/pharmacology , Analgesics/chemical synthesis , Molecular Structure , Catalysis , Drug Discovery , Animals
2.
Bioinform Adv ; 4(1): vbae035, 2024.
Article in English | MEDLINE | ID: mdl-38549946

ABSTRACT

Motivation: PE/PPE proteins, highly abundant in the Mycobacterium genome, play a vital role in virulence and immune modulation. Understanding their functions is key to comprehending the internal mechanisms of Mycobacterium. However, a lack of dedicated resources has limited research into PE/PPE proteins. Results: Addressing this gap, we introduce MycobactERIal PE/PPE proTeinS (MERITS), a comprehensive 3D structure database specifically designed for PE/PPE proteins. MERITS hosts 22 353 non-redundant PE/PPE proteins, encompassing details like physicochemical properties, subcellular localization, post-translational modification sites, protein functions, and measures of antigenicity, toxicity, and allergenicity. MERITS also includes data on their secondary and tertiary structure, along with other relevant biological information. MERITS is designed to be user-friendly, offering interactive search and data browsing features to aid researchers in exploring the potential functions of PE/PPE proteins. MERITS is expected to become a crucial resource in the field, aiding in developing new diagnostics and vaccines by elucidating the sequence-structure-functional relationships of PE/PPE proteins. Availability and implementation: MERITS is freely accessible at http://merits.unimelb-biotools.cloud.edu.au/.

3.
ACS Chem Neurosci ; 15(6): 1063-1073, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38449097

ABSTRACT

Chronic pain is a growing global health problem affecting at least 10% of the world's population. However, current chronic pain treatments are inadequate. Voltage-gated sodium channels (Navs) play a pivotal role in regulating neuronal excitability and pain signal transmission and are thus main targets for nonopioid painkiller development, especially those preferentially expressed in dorsal root ganglion (DRG) neurons, such as Nav1.6, Nav1.7, and Nav1.8. In this study, we screened virtual hits from dihydrobenzofuran and 3-hydroxyoxindole hybrid molecules against Navs via a veratridine (VTD)-based calcium imaging method. The results showed that one of the molecules, 3g, could significantly inhibit VTD-induced neuronal activity. Voltage clamp recordings demonstrated that 3g inhibited the total Na+ currents of DRG neurons in a concentration-dependent manner. Biophysical analysis revealed that 3g slowed the activation while enhancing the inactivation of the Navs. Additionally, 3g blocked Na+ currents in a use-dependent manner. By combining selective Nav inhibitors with a heterologous expression system, we demonstrated that 3g preferentially inhibited the TTX-S Na+ currents, specifically the Nav1.7 current, rather than the TTX-R Na+ currents. Molecular docking experiments suggested that 3g binds to a known allosteric site at the voltage-sensing domain IV (VSDIV) of Nav1.7. Finally, intrathecal injection of 3g significantly relieved mechanical pain behavior in the spared nerve injury (SNI) rat model, suggesting that 3g is a promising candidate for treating chronic pain.


Subject(s)
Chronic Pain , Indoles , Neuralgia , Rats , Animals , Molecular Docking Simulation , NAV1.8 Voltage-Gated Sodium Channel , Neuralgia/drug therapy , Neuralgia/metabolism , Ganglia, Spinal/metabolism
4.
J Chem Inf Model ; 64(4): 1407-1418, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38334115

ABSTRACT

Studying the effect of single amino acid variations (SAVs) on protein structure and function is integral to advancing our understanding of molecular processes, evolutionary biology, and disease mechanisms. Screening for deleterious variants is one of the crucial issues in precision medicine. Here, we propose a novel computational approach, TransEFVP, based on large-scale protein language model embeddings and a transformer-based neural network to predict disease-associated SAVs. The model adopts a two-stage architecture: the first stage is designed to fuse different feature embeddings through a transformer encoder. In the second stage, a support vector machine model is employed to quantify the pathogenicity of SAVs after dimensionality reduction. The prediction performance of TransEFVP on blind test data achieves a Matthews correlation coefficient of 0.751, an F1-score of 0.846, and an area under the receiver operating characteristic curve of 0.871, higher than the existing state-of-the-art methods. The benchmark results demonstrate that TransEFVP can be explored as an accurate and effective SAV pathogenicity prediction method. The data and codes for TransEFVP are available at https://github.com/yzh9607/TransEFVP/tree/master for academic use.
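
The abstract above reports a Matthews correlation coefficient and an F1-score on blind test data. As a reminder of how those two figures are derived from a binary confusion matrix, here is a minimal sketch in plain Python. The labels and predictions are made-up toy data, not TransEFVP output, and the function names are illustrative only.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (hypothetical): 1 = disease-associated variant, 0 = neutral variant.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("F1 :", f1_score(tp, fp, fn))
print("MCC:", matthews_corrcoef(tp, fp, tn, fn))
```

Unlike F1, MCC uses all four cells of the confusion matrix, so it stays informative when the pathogenic and neutral classes are imbalanced.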


Subject(s)
Algorithms , Proteins , Humans , Proteins/chemistry , Amino Acid Sequence , Neural Networks, Computer , Amino Acids
5.
Environ Sci Technol ; 58(10): 4662-4669, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38422482

ABSTRACT

Since the mass production and extensive use of chloroquine (CLQ) would lead to its inevitable discharge, wastewater treatment plants (WWTPs) might play a key role in the management of CLQ. Despite the reported functional versatility of ammonia-oxidizing bacteria (AOB) that mediate the first step for biological nitrogen removal at WWTP (i.e., partial nitrification), their potential capability to degrade CLQ remains to be discovered. Therefore, with the enriched partial nitrification sludge, a series of dedicated batch tests were performed in this study to verify the performance and mechanisms of CLQ biodegradation under the ammonium conditions of mainstream wastewater. The results showed that AOB could degrade CLQ in the presence of ammonium oxidation activity, but the capability was limited by the amount of partial nitrification sludge (∼1.1 mg/L at a mixed liquor volatile suspended solids concentration of 200 mg/L). CLQ and its biodegradation products were found to have no significant effect on the ammonium oxidation activity of AOB while the latter would promote N2O production through the AOB denitrification pathway, especially at relatively low DO levels (≤0.5 mg-O2/L). This study provided valuable insights into a more comprehensive assessment of the fate of CLQ in the context of wastewater treatment.


Subject(s)
Ammonia , Ammonium Compounds , Ammonia/metabolism , Sewage/microbiology , Bacteria/metabolism , Bioreactors/microbiology , Oxidation-Reduction , Nitrous Oxide/analysis , Nitrification , Ammonium Compounds/metabolism
6.
Article in English | MEDLINE | ID: mdl-38190667

ABSTRACT

Origins of replication sites (ORIs) are crucial genomic regions where DNA replication initiation takes place, playing pivotal roles in fundamental biological processes like cell division, gene expression regulation, and DNA integrity. Accurate identification of ORIs is essential for comprehending cell replication, gene expression, and mutation-related diseases. However, experimental approaches for ORI identification are often expensive and time-consuming, leading to the growing popularity of computational methods. In this study, we present PLANNER (DeeP LeArNiNg prEdictor for ORI), a novel approach for species-specific and cell-specific prediction of eukaryotic ORIs. PLANNER uses the multi-scale ktuple sequences as input and employs the DNABERT pre-training model with transfer learning and ensemble learning strategies to train accurate predictive models. Extensive empirical test results demonstrate that PLANNER achieved superior predictive performance compared to state-of-the-art approaches, including iOri-Euk, Stack-ORI, and ORI-Deep, within specific cell types and across different cell types. Furthermore, by incorporating an interpretable analysis mechanism, we provide insights into the learned patterns, facilitating the mapping from discovering important sequential determinants to comprehensively analysing their biological functions. To facilitate the widespread utilisation of PLANNER, we developed an online webserver and local stand-alone software, available at http://planner.unimelb-biotools.cloud.edu.au/ and https://github.com/CongWang3/PLANNER, respectively.
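
The abstract above says that PLANNER takes multi-scale k-tuple sequences as input to a DNABERT-style model. A rough sketch of what such tokenization can look like follows. The choice of k = 3 to 6, the stride of 1, and the function names are assumptions made for illustration; the abstract does not specify them, and this is not the authors' code.

```python
def kmer_tokens(sequence, k):
    """Split a DNA sequence into overlapping k-mers with stride 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def multi_scale_kmers(sequence, ks=(3, 4, 5, 6)):
    """Tokenize the same sequence at several k values (assumed scales)."""
    return {k: kmer_tokens(sequence, k) for k in ks}


# Hypothetical 8-nt fragment, used only to show the shape of the output.
tokens = multi_scale_kmers("ATGCGTAC")
for k, toks in tokens.items():
    print(k, toks)
```

Each scale produces `len(sequence) - k + 1` tokens, so shorter k-mers give longer token lists over a smaller vocabulary.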

7.
Comput Biol Med ; 168: 107681, 2024 01.
Article in English | MEDLINE | ID: mdl-37992470

ABSTRACT

Multidrug-resistant Gram-negative bacteria have evolved into a worldwide threat to human health; over recent decades, polymyxins have re-emerged in clinical practice due to their high activity against multidrug-resistant bacteria. Nevertheless, the nephrotoxicity and neurotoxicity of polymyxins seriously hinder their practical use in the clinic. Based on the quantitative structure-activity relationship (QSAR), analogue design is an efficient strategy for discovering biologically active compounds with fewer adverse effects. To accelerate the discovery of polymyxin analogues and find analogues with high antimicrobial activity against Gram-negative bacteria, here we developed PmxPred, a GCN and CatBoost-based machine learning framework. The RDKit descriptors were used to represent the molecules and residues, and the ensemble learning model was used to predict antimicrobial activity. This framework was trained and evaluated on multiple Gram-negative bacteria datasets, including Acinetobacter baumannii, Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa and a general Gram-negative bacteria dataset, achieving an AUROC of 0.857, 0.880, 0.756, 0.895 and 0.865 on the independent test, respectively. PmxPred outperformed the transfer learning method that was trained on 10 million molecules. We interpreted our well-trained model by analysing the importance of global and residue features. Overall, PmxPred provides a powerful additional tool for predicting active polymyxin analogues, and holds the potential to elucidate the mechanisms underlying the antimicrobial activity of polymyxins. The source code is publicly available on GitHub (https://github.com/yanwu20/PmxPred).
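
The abstract above reports AUROC values for each dataset. AUROC can be read as the probability that a randomly chosen active compound receives a higher score than a randomly chosen inactive one. The sketch below computes it directly from that definition in plain Python. The labels and scores are invented toy values, not PmxPred predictions.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = active analogue, 0 = inactive analogue.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auroc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.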


Subject(s)
Gram-Negative Bacterial Infections , Polymyxins , Humans , Polymyxins/pharmacology , Polymyxins/chemistry , Anti-Bacterial Agents/chemistry , Gram-Negative Bacterial Infections/drug therapy , Gram-Negative Bacterial Infections/microbiology , Gram-Negative Bacteria , Drug Resistance, Multiple, Bacterial , Escherichia coli , Microbial Sensitivity Tests
8.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37874948

ABSTRACT

Proteases contribute to a broad spectrum of cellular functions. Given a relatively limited amount of experimental data, developing accurate sequence-based predictors of substrate cleavage sites facilitates a better understanding of protease functions and substrate specificity. While many protease-specific predictors of substrate cleavage sites were developed, these efforts are outpaced by the growth of the protease substrate cleavage data. In particular, since data for 100+ protease types are available and this number continues to grow, it becomes impractical to publish predictors for new protease types, and instead it might be better to provide a computational platform that helps users to quickly and efficiently build predictors that address their specific needs. To this end, we conceptualized, developed, tested and released a versatile bioinformatics platform, ProsperousPlus, that empowers users, even those with no programming or little bioinformatics background, to build fast and accurate predictors of substrate cleavage sites. ProsperousPlus facilitates the use of the rapidly accumulating substrate cleavage data to train, empirically assess and deploy predictive models for user-selected substrate types. Benchmarking tests on test datasets show that our platform produces predictors that on average exceed the predictive performance of current state-of-the-art approaches. ProsperousPlus is available as a webserver and a stand-alone software package at http://prosperousplus.unimelb-biotools.cloud.edu.au/.


Subject(s)
Machine Learning , Peptide Hydrolases , Peptide Hydrolases/metabolism , Substrate Specificity , Algorithms
9.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37291763

ABSTRACT

BACKGROUND: Promoters are DNA regions that initiate the transcription of specific genes near the transcription start sites. In bacteria, promoters are recognized by RNA polymerases and associated sigma factors. Effective promoter recognition is essential for bacteria to synthesize gene-encoded products, grow, and adapt to different environmental conditions. A variety of machine learning-based predictors for bacterial promoters have been developed; however, most of them were designed specifically for a particular species. To date, only a few predictors are available for identifying general bacterial promoters, and their predictive performance is limited. RESULTS: In this study, we developed TIMER, a Siamese neural network-based approach for identifying both general and species-specific bacterial promoters. Specifically, TIMER uses DNA sequences as the input and employs three Siamese neural networks with attention layers to train and optimize the models for a total of 13 species-specific and general bacterial promoters. Extensive 10-fold cross-validation and independent tests demonstrated that TIMER achieves a competitive performance and outperforms several existing methods on both general and species-specific promoter prediction. As an implementation of the proposed method, the web server of TIMER is publicly accessible at http://web.unimelb-bioinfortools.cloud.edu.au/TIMER/.


Subject(s)
Bacteria , Neural Networks, Computer , Bacteria/genetics , Bacteria/metabolism , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Base Sequence , Promoter Regions, Genetic
10.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 3205-3214, 2023.
Article in English | MEDLINE | ID: mdl-37289599

ABSTRACT

It has been demonstrated that RNA modifications play essential roles in multiple biological processes. Accurate identification of RNA modifications in the transcriptome is critical for providing insights into the biological functions and mechanisms. Many tools have been developed for predicting RNA modifications at single-base resolution, which employ conventional feature engineering methods that focus on feature design and feature selection processes that require extensive biological expertise and may introduce redundant information. With the rapid development of artificial intelligence technologies, end-to-end methods are favorably received by researchers. Nevertheless, each well-trained model is only suitable for a specific RNA methylation modification type for nearly all of these approaches. In this study, we present MRM-BERT by feeding task-specific sequences into the powerful BERT (Bidirectional Encoder Representations from Transformers) model and implementing fine-tuning, which exhibits competitive performance to the state-of-the-art methods. MRM-BERT avoids repeated de novo training of the model and can predict multiple RNA modifications such as pseudouridine, m6A, m5C, and m1A in Mus musculus, Arabidopsis thaliana, and Saccharomyces cerevisiae. In addition, we analyse the attention heads to provide high attention regions for the prediction, and conduct saturated in silico mutagenesis of the input sequences to discover potential changes of RNA modifications, which can better assist researchers in their follow-up research.
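
The abstract above mentions saturated in silico mutagenesis of the input sequences. The idea can be sketched in a few lines: substitute every position with every alternative base and record how a scoring function changes. In this sketch `toy_score` is a hypothetical stand-in for a trained predictor such as MRM-BERT, and all function names are illustrative rather than taken from the authors' software.

```python
RNA_BASES = "ACGU"


def saturation_mutants(sequence):
    """Yield (position, ref_base, alt_base, mutant) for every single-base substitution."""
    sequence = sequence.upper()
    for i, ref in enumerate(sequence):
        for alt in RNA_BASES:
            if alt != ref:
                yield i, ref, alt, sequence[:i] + alt + sequence[i + 1:]


def mutation_effects(sequence, score_fn):
    """Map each substitution to the change in score relative to the wild type."""
    base = score_fn(sequence.upper())
    return {
        (i, ref, alt): score_fn(mutant) - base
        for i, ref, alt, mutant in saturation_mutants(sequence)
    }


def toy_score(seq):
    """Hypothetical scoring function: fraction of adenosines in the sequence."""
    return seq.count("A") / len(seq)


effects = mutation_effects("GGACU", toy_score)
# Substitutions with the largest absolute effect come first.
for key, delta in sorted(effects.items(), key=lambda kv: -abs(kv[1]))[:3]:
    print(key, round(delta, 3))
```

For a sequence of length L over a four-letter alphabet this evaluates 3L mutants, so the cost grows linearly with sequence length.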


Subject(s)
Arabidopsis , Artificial Intelligence , Mice , Animals , Pseudouridine , Arabidopsis/genetics , Transcriptome , Saccharomyces cerevisiae/genetics , RNA/genetics
11.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37369638

ABSTRACT

Antimicrobial peptides (AMPs) are short peptides that play crucial roles in diverse biological processes and have various functional activities against target organisms. Due to the abuse of chemical antibiotics and microbial pathogens' increasing resistance to antibiotics, AMPs have the potential to be alternatives to antibiotics. As such, the identification of AMPs has become a widely discussed topic. A variety of computational approaches have been developed to identify AMPs based on machine learning algorithms. However, most of them are not capable of predicting the functional activities of AMPs, and those predictors that can specify activities only focus on a few of them. In this study, we first surveyed 10 predictors that can identify AMPs and their functional activities in terms of the features they employed and the algorithms they utilized. Then, we constructed comprehensive AMP datasets and proposed a new deep learning-based framework, iAMPCN (identification of AMPs based on CNNs), to identify AMPs and their related 22 functional activities. Our experiments demonstrate that iAMPCN significantly improved the prediction performance of AMPs and their corresponding functional activities based on four types of sequence features. Benchmarking experiments on the independent test datasets showed that iAMPCN outperformed a number of state-of-the-art approaches for predicting AMPs and their functional activities. Furthermore, we analyzed the amino acid preferences of different AMP activities and evaluated the model on datasets of varying sequence redundancy thresholds. To facilitate the community-wide identification of AMPs and their corresponding functional types, we have made the source codes of iAMPCN publicly available at https://github.com/joy50706/iAMPCN/tree/master. We anticipate that iAMPCN can be explored as a valuable tool for identifying potential AMPs with specific functional activities for further experimental validation.


Subject(s)
Antimicrobial Cationic Peptides , Deep Learning , Antimicrobial Cationic Peptides/pharmacology , Antimicrobial Peptides , Anti-Bacterial Agents , Algorithms
12.
Comput Biol Med ; 163: 107155, 2023 09.
Article in English | MEDLINE | ID: mdl-37356289

ABSTRACT

The genome of Mycobacterium tuberculosis contains a relatively high percentage (10%) of genes that are poorly characterised because of their highly repetitive nature and high GC content. Some of these genes encode proteins of the PE/PPE family, which are thought to be involved in host-pathogen interactions, virulence, and disease pathogenicity. Members of this family are genetically divergent and challenging to both identify and classify using conventional computational tools. Thus, advanced in silico methods are needed to efficiently identify proteins of this family for subsequent functional annotation. In this study, we developed the first deep learning-based approach, termed Digerati, for the rapid and accurate identification of PE and PPE family proteins. Digerati was built upon a multipath parallel hybrid deep learning framework, which combines multi-layer convolutional neural networks with bidirectional long short-term memory and a self-attention module to effectively learn the higher-order feature representations of PE/PPE proteins. Empirical studies demonstrated that Digerati achieved a significantly better performance (∼18-20%) than alignment-based approaches, including BLASTP, PHMMER, and HHsuite, in both prediction accuracy and speed. Digerati is anticipated to facilitate community-wide efforts to conduct high-throughput identification and analysis of PE/PPE family members. The webserver and source codes of Digerati are publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/Digerati/.


Subject(s)
Deep Learning , Mycobacterium tuberculosis , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , Bacterial Proteins/genetics , Virulence/genetics
13.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37150785

ABSTRACT

A-to-I editing is the most prevalent RNA editing event, which refers to the change of adenosine (A) bases to inosine (I) bases in double-stranded RNAs. Several studies have revealed that A-to-I editing can regulate cellular processes and is associated with various human diseases. Therefore, accurate identification of A-to-I editing sites is crucial for understanding RNA-level (i.e. transcriptional) modifications and their potential roles in molecular functions. To date, various computational approaches for A-to-I editing site identification have been developed; however, their performance is still unsatisfactory and needs further improvement. In this study, we developed a novel stacked-ensemble learning model, ATTIC (A-To-I ediTing predICtor), to accurately identify A-to-I editing sites across three species, including Homo sapiens, Mus musculus and Drosophila melanogaster. We first comprehensively evaluated 37 RNA sequence-derived features combined with 14 popular machine learning algorithms. Then, we selected the optimal base models to build a series of stacked ensemble models. The final ATTIC framework was developed based on the optimal models improved by the feature selection strategy for specific species. Extensive cross-validation and independent tests illustrate that ATTIC outperforms state-of-the-art tools for predicting A-to-I editing sites. We also developed a web server for ATTIC, which is publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/ATTIC/. We anticipate that ATTIC can be utilized as a useful tool to accelerate the identification of A-to-I RNA editing events and help characterize their roles in post-transcriptional regulation.


Subject(s)
Drosophila melanogaster , RNA Editing , Animals , Mice , Humans , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , RNA/genetics , Adenosine/genetics , Adenosine/metabolism , Inosine/genetics , Inosine/metabolism
14.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36880172

ABSTRACT

Lysine 2-hydroxyisobutylation (Khib), which was first reported in 2014, has been shown to play vital roles in a myriad of biological processes including gene transcription, regulation of chromatin functions, purine metabolism, pentose phosphate pathway and glycolysis/gluconeogenesis. Identification of Khib sites in protein substrates represents an initial but crucial step in elucidating the molecular mechanisms underlying protein 2-hydroxyisobutylation. Experimental identification of Khib sites mainly depends on the combination of liquid chromatography and mass spectrometry. However, experimental approaches for identifying Khib sites are often time-consuming and expensive compared with computational approaches. Previous studies have shown that Khib sites may have distinct characteristics for different cell types of the same species. Several tools have been developed to identify Khib sites, which exhibit high diversity in their algorithms, encoding schemes and feature selection techniques. However, to date, there are no tools designed for predicting cell type-specific Khib sites. Therefore, it is highly desirable to develop an effective predictor for cell type-specific Khib site prediction. Inspired by the residual connection of ResNet, we develop a deep learning-based approach, termed ResNetKhib, which leverages both the one-dimensional convolution and transfer learning to enable and improve the prediction of cell type-specific 2-hydroxyisobutylation sites. ResNetKhib is capable of predicting Khib sites for four human cell types, mouse liver cells and three rice cell types. Its performance is benchmarked against the commonly used random forest (RF) predictor on both 10-fold cross-validation and independent tests. The results show that ResNetKhib achieves area under the receiver operating characteristic curve values ranging from 0.807 to 0.901, depending on the cell type and species, and performs better than RF-based predictors and other currently available Khib site prediction tools. We also implement an online web server of the proposed ResNetKhib algorithm together with all the curated datasets and trained model for the wider research community to use, which is publicly accessible at https://resnetkhib.erc.monash.edu/.


Subject(s)
Lysine , Protein Processing, Post-Translational , Animals , Mice , Humans , Lysine/metabolism , Proteins/metabolism , Algorithms , Machine Learning
15.
Methods Mol Biol ; 2624: 139-151, 2023.
Article in English | MEDLINE | ID: mdl-36723814

ABSTRACT

Pseudouridine is a ubiquitous RNA modification and plays a crucial role in many biological processes. However, it remains a challenging task to identify pseudouridine sites using expensive and time-consuming experimental research. To this end, we present Porpoise, a computational approach to identify pseudouridine sites from RNA sequence data. Porpoise builds on a stacking ensemble learning framework with several informative features and achieves competitive performance compared with state-of-the-art approaches. This protocol elaborates on the step-by-step use and execution of the local stand-alone version and the webserver of Porpoise. In addition, we also provide a general machine learning framework that can help identify the optimal stacking ensemble learning model using different combinations of feature-based features. This general machine learning framework can help users build their own pseudouridine predictors using their in-house datasets.


Subject(s)
Pseudouridine , RNA , RNA/genetics , Machine Learning , Base Sequence
16.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36528806

ABSTRACT

Determining the pathogenicity and functional impact (i.e. gain-of-function; GOF or loss-of-function; LOF) of a variant is vital for unraveling the genetic level mechanisms of human diseases. To provide a 'one-stop' framework for the accurate identification of pathogenicity and functional impact of variants, we developed a two-stage deep-learning-based computational solution, termed VPatho, which was trained using a total of 9619 pathogenic GOF/LOF and 138 026 neutral variants curated from various databases. A total number of 138 variant-level, 262 protein-level and 103 genome-level features were extracted for constructing the models of VPatho. The development of VPatho consists of two stages: (i) a random under-sampling multi-scale residual neural network (ResNet) with a newly defined weighted-loss function (RUS-Wg-MSResNet) was proposed to predict variants' pathogenicity on the gnomAD_NV + GOF/LOF dataset; and (ii) an XGBOD model was constructed to predict the functional impact of the given variants. Benchmarking experiments demonstrated that RUS-Wg-MSResNet achieved the highest prediction performance with the weights calculated based on the ratios of neutral versus pathogenic variants. Independent tests showed that both RUS-Wg-MSResNet and XGBOD achieved outstanding performance. Moreover, assessed using variants from the CAGI6 competition, RUS-Wg-MSResNet achieved superior performance compared to state-of-the-art predictors. The fine-trained XGBOD models were further used to blind test the whole LOF data downloaded from gnomAD and accordingly, we identified 31 nonLOF variants that were previously labeled as LOF/uncertain variants. As an implementation of the developed approach, a webserver of VPatho is made publicly available at http://csbio.njust.edu.cn/bioinf/vpatho/ to facilitate community-wide efforts for profiling and prioritizing the query variants with respect to their pathogenicity and functional impact.
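
The abstract above describes random under-sampling combined with a weighted loss whose weights are based on the ratio of neutral to pathogenic variants. The exact formula is not given there, so the sketch below assumes the simplest reading: the pathogenic class is weighted by neutral/pathogenic and the neutral class by 1. The class sizes come from the abstract; the weighting scheme, the sampling helper, and all names are hypothetical.

```python
import math
import random

N_PATHOGENIC = 9619    # pathogenic GOF/LOF variants (from the abstract)
N_NEUTRAL = 138026     # neutral variants (from the abstract)


def ratio_weights(n_negative, n_positive):
    """Assumed scheme: weight the minority (positive) class by the class ratio."""
    return {0: 1.0, 1: n_negative / n_positive}


def weighted_bce(y_true, p_pred, weights, eps=1e-12):
    """Class-weighted binary cross-entropy, averaged over examples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
    return total / len(y_true)


def random_under_sample(majority, minority, seed=0):
    """Draw as many majority examples as there are minority examples."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)) + list(minority)


weights = ratio_weights(N_NEUTRAL, N_PATHOGENIC)
print("pathogenic class weight:", round(weights[1], 2))
```

With these counts a misclassified pathogenic variant costs roughly fourteen times as much as a misclassified neutral one, which is the imbalance that the under-sampling step also addresses.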


Subject(s)
Deep Learning , Humans , Gain of Function Mutation , Genome
17.
Brief Funct Genomics ; 22(3): 274-280, 2023 05 18.
Article in English | MEDLINE | ID: mdl-36528813

ABSTRACT

Antiviral defense is one of the significant roles of RNA interference (RNAi) in plants. It has been reported that the host RNAi machinery can target viral RNAs for destruction because virus-derived small interfering RNAs (vsiRNAs) are found in infected host cells. Therefore, the recognition of plant vsiRNAs is the key to understanding the functional mechanisms of vsiRNAs and developing antiviral plants. In this work, we introduce a deep learning-based stacking ensemble approach, named computational prediction of plant exclusive virus-derived small interfering RNAs (COPPER), for plant vsiRNA prediction. COPPER used word2vec and fastText to generate sequence features and a hybrid deep learning framework, including a convolutional neural network, multiscale residual network and bidirectional long short-term memory network with a self-attention mechanism, to enable precise predictions of plant vsiRNAs. Extensive benchmarking experiments with different sequence homology thresholds and ablation studies illustrated the comparative predictive performance of COPPER. In addition, the performance comparison with PVsiRNAPred conducted on an independent test dataset showed that COPPER significantly improved the predictive performance for plant vsiRNAs compared with other state-of-the-art methods. The datasets and source codes are publicly available at https://github.com/yuanyuanbu/COPPER.


Subject(s)
Deep Learning , Plant Viruses , RNA, Small Interfering/genetics , Copper , RNA Interference , Plants/genetics , Plant Viruses/genetics , Antiviral Agents
18.
Interdiscip Sci ; 15(1): 100-110, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36350503

ABSTRACT

Microsatellite instability (MSI), a vital mutator phenotype caused by DNA mismatch repair deficiency, is frequently observed in several tumors. MSI is recognized as a critical molecular biomarker for diagnosis, prognosis, and therapeutic selection in several cancers. Identifying MSI status with the current gold-standard methods, which are based on experimental analysis, is laborious, time-consuming, and costly. Although several computational methods based on machine learning have been proposed to identify MSI status, we need to further understand which machine learning model is best suited to identifying MSI and which feature subset is strongly related to MSI. On this basis, more effective machine learning-based methods can be developed to improve the performance of MSI status identification. In this work, we present MSINGB, an NGBoost-based method for identifying MSI status from tumor somatic mutation annotation data. MSINGB first evaluates the prediction performance of 11 popular machine learning algorithms and 9 deep learning models to identify MSI. Among the 20 models, NGBoost, a novel natural gradient boosting method, achieves the overall best performance. MSINGB then introduces two feature selection strategies to find the compact feature subset that is strongly related to MSI, and employs the SHAP approach to interpret how the selected features impact the model prediction. MSINGB achieves a better prediction performance on both the tenfold cross-validation test and independent test compared with state-of-the-art methods.


Subject(s)
Brain Neoplasms , Colorectal Neoplasms , Humans , Microsatellite Instability , Colorectal Neoplasms/genetics , Mutation , Phenotype , Microsatellite Repeats
19.
Int J Mol Sci ; 23(23), 2022 Nov 27.
Article in English | MEDLINE | ID: mdl-36499167

ABSTRACT

Neuropathic pain is a refractory chronic disease affecting millions of people worldwide. Given that present painkillers have poor efficacy or severe side effects, novel analgesics are badly needed. The complex structures of active ingredients isolated from natural products provide a new source for phytochemical compound synthesis. Here, we identified a natural product, Narirutin, a flavonoid compound isolated from Citrus unshiu, that shows antinociceptive effects in rodent models of neuropathic pain. Using calcium imaging, whole-cell electrophysiology, western blotting, and immunofluorescence, we uncovered a molecular target for Narirutin's antinociceptive actions. We found that Narirutin (i) inhibits Veratridine-triggered nociceptor activities in L4-L6 rat dorsal root ganglion (DRG) neurons, (ii) blocks voltage-gated sodium (NaV) channels subtype 1.7 in both small-diameter DRG nociceptive neurons and the human embryonic kidney (HEK) 293 cell line, (iii) does not affect tetrodotoxin-resistant (TTX-R) NaV channels, and (iv) blunts the upregulation of Nav1.7 in calcitonin gene-related peptide (CGRP)-labeled DRG sensory neurons after spared nerve injury (SNI) surgery. Identifying Nav1.7 as a molecular target of Narirutin may further clarify the analgesic mechanism of natural flavonoid compounds and suggest a strategy for producing novel selective and efficient analgesic drugs.


Subject(s)
Biological Products , Neuralgia , Voltage-Gated Sodium Channels , Rats , Humans , Animals , Biological Products/pharmacology , Biological Products/therapeutic use , Biological Products/metabolism , HEK293 Cells , Sprague-Dawley Rats , Neuralgia/drug therapy , Neuralgia/metabolism , Spinal Ganglia/metabolism , Voltage-Gated Sodium Channels/metabolism , Tetrodotoxin/pharmacology , Sensory Receptor Cells/metabolism , Analgesics/pharmacology , Analgesics/therapeutic use , Analgesics/metabolism , NAV1.7 Voltage-Gated Sodium Channel/metabolism
20.
Brief Bioinform ; 23(6), 2022 Nov 19.
Article in English | MEDLINE | ID: mdl-36341591

ABSTRACT

Subcellular localization of messenger RNAs (mRNAs) plays a key role in the spatial regulation of gene activity. The functions of mRNAs have been shown to be closely linked with their localizations. As such, understanding the subcellular localizations of mRNAs can help elucidate gene regulatory networks. Although several computational methods have been developed to predict mRNA localizations within cells, there is still much room for improvement in predictive performance, especially for multiple-location prediction. In this study, we propose a novel multi-label multi-class predictor, termed Clarion, for mRNA subcellular localization prediction. Clarion was developed based on a manually curated benchmark dataset and leverages the weighted series method for multi-label transformation. Extensive benchmarking tests demonstrated that Clarion achieved competitive predictive performance and that the weighted series method plays a crucial role in securing this performance. In addition, the independent test results indicate that Clarion outperformed the state-of-the-art methods, with accuracies of 81.47, 91.29, 79.77, 92.10, 89.15, 83.74, 80.74, 79.23 and 84.74% for chromatin, cytoplasm, cytosol, exosome, membrane, nucleolus, nucleoplasm, nucleus and ribosome, respectively. The webserver and local stand-alone tool of Clarion are freely available at http://monash.bioweb.cloud.edu.au/Clarion/.
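Per-location accuracies like those listed above can be computed from multi-label predictions roughly as sketched here. The sketch assumes that accuracy is evaluated independently for each location, as the fraction of mRNAs whose presence/absence label is predicted correctly; the abstract does not give Clarion's exact definition, and the function name `per_label_accuracy` and the toy labels are illustrative only.

```python
LOCATIONS = ["chromatin", "cytoplasm", "cytosol", "exosome", "membrane",
             "nucleolus", "nucleoplasm", "nucleus", "ribosome"]


def per_label_accuracy(y_true, y_pred):
    """Return one accuracy per label for binary indicator matrices.

    y_true and y_pred are lists of rows; each row holds one 0/1 entry per
    label. Hypothetical helper, not taken from the Clarion code.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    n_samples = len(y_true)
    n_labels = len(y_true[0])
    return [
        sum(t[j] == p[j] for t, p in zip(y_true, y_pred)) / n_samples
        for j in range(n_labels)
    ]


# Toy data: two made-up mRNAs, each annotated with several locations.
truth = [[0, 1, 1, 0, 0, 0, 0, 1, 0],
         [1, 0, 0, 0, 0, 0, 1, 1, 0]]
preds = [[0, 1, 0, 0, 0, 0, 0, 1, 0],
         [1, 0, 0, 0, 0, 0, 1, 0, 0]]
for name, acc in zip(LOCATIONS, per_label_accuracy(truth, preds)):
    print(f"{name}: {acc:.2%}")
```

Reporting one figure per location, rather than a single pooled score, makes it visible when a predictor does well on common compartments but poorly on rare ones.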


Subject(s)
Cell Nucleus , Proteins , Messenger RNA/genetics , Cell Nucleus/genetics , Computational Biology/methods , Protein Databases